Improvements of Hidden Chunk Models
Abstract
The statistical properties of segments [8] are investigated using a specific acoustic model called the hidden chunk model (HCM). We call the sequence of feature vectors assigned to a segment a chunk of length ℓ. The HCM still assumes that the feature vectors are statistically independent. In contrast to the hidden Markov model (HMM), we introduce emission probabilities that depend on ℓ. Segment error rates (SERs) are calculated on a database with over 33 million chunks aligned to 607 segments. The HCM achieves more than 10% absolute improvement in SER compared to the HMM. Based on the estimated Shannon entropy, the proposed HCM paves the way toward acoustic models that approach the lowest possible SER.
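The difference between length-independent and length-dependent emission probabilities can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the feature vectors are reduced to scalars, the emission densities are Gaussian, and the names `hmm_chunk_loglik`, `hcm_chunk_loglik`, `classify`, and `toy_models` are hypothetical. Frames within a chunk are treated as statistically independent, so the chunk log-likelihood is a sum over frames.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a scalar Gaussian (assumed emission form, not from the paper)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def hmm_chunk_loglik(chunk, mean, var):
    """Length-independent emissions: one (mean, var) pair for every frame."""
    return sum(gaussian_logpdf(x, mean, var) for x in chunk)


def hcm_chunk_loglik(chunk, params_by_length):
    """Length-dependent emissions: the (mean, var) pair is selected by the
    chunk length, then frames are scored independently.
    Raises KeyError if no parameters exist for this length."""
    mean, var = params_by_length[len(chunk)]
    return sum(gaussian_logpdf(x, mean, var) for x in chunk)


def classify(chunk, models):
    """Return the segment label whose length-dependent model scores highest."""
    return max(models, key=lambda seg: hcm_chunk_loglik(chunk, models[seg]))


# Hypothetical toy parameters: segment label -> chunk length -> (mean, var).
# The two segments swap their means between length 1 and length 3, so the
# same feature values are assigned differently depending on chunk length.
toy_models = {
    "a": {1: (0.0, 1.0), 3: (2.0, 1.0)},
    "b": {1: (2.0, 1.0), 3: (0.0, 1.0)},
}
```

With these toy parameters, `classify([0.1], toy_models)` selects segment `"a"`, while `classify([0.1, -0.2, 0.0], toy_models)` selects segment `"b"`. A single length-independent density per segment cannot express this distinction.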
Similar resources
Tagging Complex Non-Verbal German Chunks with Conditional Random Fields
We report on chunk tagging methods for German that recognize complex non-verbal phrases using structural chunk tags with Conditional Random Fields (CRFs). This state-of-the-art method for sequence classification achieves 93.5% accuracy on newspaper text. For the same task, a classical trigram tagger approach based on Hidden Markov Models reaches a baseline of 88.1%. CRFs allow for a clean and p...
Bitext Alignment for Statistical Machine Translation
Bitext alignment is the task of finding translation equivalence between documents in two languages, collections of which are commonly known as bitext. This dissertation addresses the problems of statistical alignment at various granularities from sentence to word with the goal of creating Statistical Machine Translation (SMT) systems. SMT systems are statistical pattern processors based on para...
Phrase-Boundary Translation Model Using Shallow Syntactic Labels
The phrase-boundary model for statistical machine translation labels rules with classes of boundary words on the target-side phrases of the training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...
Error-driven HMM-based Chunk Tagger with Context-dependent Lexicon
This paper proposes a new error-driven HMM-based text chunk tagger with a context-dependent lexicon. Compared with a standard HMM-based tagger, this tagger uses a new Hidden Markov Modelling approach which incorporates more contextual information into a lexical entry. Moreover, an error-driven learning approach is adopted to decrease the memory requirement by keeping only positive lexical entries an...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Publication date: 2010